Support MSA SIMD for MIPS #244

Open
wants to merge 1 commit into
base: master


@i-evi i-evi commented Jun 1, 2021

--enable-msa now works in single and double precision on MIPS.
Tested on both 32-bit and 64-bit MIPS running in little-endian mode.

@rdolbeau
Contributor

rdolbeau commented Mar 2, 2023

@i-evi How is that testable? Is there support for MSA in Qemu (+ linux, toolchains, ...) so that it could be tested there? Is there some MSA hardware available?

@i-evi
Author

i-evi commented Mar 3, 2023

@i-evi How is that testable? Is there support for MSA in Qemu (+ linux, toolchains, ...) so that it could be tested there? Is there some MSA hardware available?

@rdolbeau MSA (the MIPS SIMD Architecture) was introduced in MIPS Release 5. It supports 128-bit single- and double-precision floating-point vector arithmetic and is available on some MIPS processors, such as:

Ingenic X2000 (32-bit, MIPS R5 compatible)
Loongson 3A4000 (64-bit, MIPS R5 compatible)

I have both of these platforms and tested fftw with the MSA SIMD optimizations on them. All test cases passed, with some performance improvement. Debian supports the mipsel and mips64el architectures, so testing on these platforms is simply a matter of compiling and running natively.

If MIPS hardware is hard to obtain, you can also test with QEMU. To cross-compile on Debian, you can install the toolchain directly through apt:

apt install gcc-mipsel-linux-gnu
apt install gcc-mips64el-linux-gnuabi64

MIPS Release 6 made major changes to the instruction set architecture, and the toolchain changed with it. On Debian, the MIPS R6 toolchains can be installed as follows:

apt install gcc-mipsisa32r6el-linux-gnu
apt install gcc-mipsisa64r6el-linux-gnuabi64

Unfortunately, QEMU currently has limited support for MIPS MSA. Debian 11 (bullseye) provides QEMU 5.2.0, in which only 64-bit MIPS R6 CPUs execute MSA instructions correctly. In that case you need to pass --host=mipsisa64r6el-linux-gnuabi64 to configure so that the MIPS R6 toolchain is used, and select the CPU model when running qemu-mips64el-static, for example:

qemu-mips64el-static -cpu I6500

QEMU 7.2.0 supports MSA slightly better: some MIPS R5 CPUs, such as Loongson-3A4000, also execute MSA instructions correctly. For those, specify --host=mips64el-linux-gnuabi64 to compile with the traditional MIPS toolchain. For 32-bit MIPS CPUs, however, QEMU does not yet seem to support MSA instructions correctly, and development work is still in progress.
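
The toolchain choices above can be summarized in a small helper. This is only a sketch: the function name host_triplet and the labels legacy/r6 are made up for illustration, and the triplets are the ones behind the Debian packages listed earlier in this comment.

```shell
# host_triplet BITS FLAVOR
# Hypothetical helper: prints the GNU triplet to pass to configure --host.
#   BITS   = 32 | 64
#   FLAVOR = legacy (traditional toolchain) | r6 (MIPS Release 6 toolchain)
host_triplet() {
    case "$1-$2" in
        32-legacy) echo mipsel-linux-gnu ;;
        64-legacy) echo mips64el-linux-gnuabi64 ;;
        32-r6)     echo mipsisa32r6el-linux-gnu ;;
        64-r6)     echo mipsisa64r6el-linux-gnuabi64 ;;
        *) echo "usage: host_triplet 32|64 legacy|r6" >&2; return 1 ;;
    esac
}
```

For example, ./configure --host="$(host_triplet 64 legacy)" --enable-msa would select the traditional 64-bit toolchain; any other configure options are left out here.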

I tested fftw with the MSA optimizations under QEMU, and it works fine on 64-bit MIPS CPUs that support MSA.

FFTW is an excellent project and is widely used in digital signal processing software. Although the development of MIPS over the years has been disappointing, a large number of MIPS chips are still in service or in production. Thank you for your attention; I hope fftw can support MIPS MSA.

@rdolbeau
Contributor

rdolbeau commented Mar 5, 2023

@i-evi Thanks for the details; for QEmu I was hoping for full system emulation - it's a lot easier to test when running natively. I did some stuff in MSA for crypto (e.g. https://github.com/jedisct1/supercop/tree/master/crypto_stream/chacha20/dolbeau/mipsel-msa) and at the time couldn't get QEmu+Linux to support MSA... if it's possible, would you have an example command line for QEmu (I have 7.2.0) that would enable installing e.g. Debian mips64el? (any other distro would do if it's easier).

Meanwhile I'd suggest updating the copyright in simd-msa.h; also if the code only supports little-endian, it probably should fail explicitly in big-endian just in case.
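
One possible way to make the little-endian assumption explicit at build time is to ask the compiler for its predefined macros. The sketch below is an assumption about how such a check could look, not what the PR does: is_little_endian is a hypothetical helper, and it relies on the __BYTE_ORDER__ macro that GCC and Clang predefine.

```shell
# is_little_endian
# Hypothetical helper: reads a compiler macro dump (the output of
# `cc -dM -E -`) on stdin and succeeds only if the target is little-endian.
is_little_endian() {
    grep -q '^#define __BYTE_ORDER__ __ORDER_LITTLE_ENDIAN__$'
}
```

It could be used as: echo | mips64el-linux-gnuabi64-gcc -dM -E - | is_little_endian || echo "MSA code is little-endian only" >&2. The same macros could also be tested directly in the header with the preprocessor.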

@i-evi
Author

i-evi commented Mar 5, 2023

@rdolbeau Testing with QEMU is a little more complicated. A simple way is to use binfmt_misc + QEMU.

Assuming you are using Debian (I recommend bookworm, the current sid, as it has the newer version 7.2.0 of QEMU), install the package qemu-user-static:

apt install qemu-user-static

The package manager configures binfmt support so that we can run binaries for any architecture supported by QEMU directly on the host. For example, we first cross-compile a simple program:

$ mips64el-linux-gnuabi64-gcc hello.c -static

In this way, we get an executable a.out:

$ readelf -h a.out 
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           MIPS R3000
  Version:                           0x1
  Entry point address:               0x120003ae0
  Start of program headers:          64 (bytes into file)
  Start of section headers:          732264 (bytes into file)
  Flags:                             0x80000007, noreorder, pic, cpic, mips64r2
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         6
  Size of section headers:           64 (bytes)
  Number of section headers:         32
  Section header string table index: 31

We already have binfmt_misc support, so we can run it directly:

$ ./a.out 
hello!

This is equivalent to:

$ qemu-mips64el-static ./a.out 
hello!
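
If running ./a.out directly fails, it may be worth confirming that the binfmt_misc entry is registered and enabled. A minimal sketch, assuming the entry is named qemu-mips64el under /proc/sys/fs/binfmt_misc (that name is an assumption based on Debian's packaging) and using a hypothetical helper binfmt_enabled:

```shell
# binfmt_enabled ENTRY_FILE
# Hypothetical helper: succeeds if the binfmt_misc entry file exists and
# its first line reads "enabled".
binfmt_enabled() {
    [ -r "$1" ] && [ "$(head -n 1 "$1")" = "enabled" ]
}
```

For example: binfmt_enabled /proc/sys/fs/binfmt_misc/qemu-mips64el || echo "binfmt entry missing or disabled" >&2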

But when we compile dynamically linked programs, we may encounter errors like this:

$ mips64el-linux-gnuabi64-gcc hello.c && ./a.out 
mips64el-binfmt-P: Could not open '/lib64/ld.so.1': No such file or directory

The toolchain provides these dynamic libraries; you can find them under /usr/<target triplet>/. We can run our program via chroot:

$ cp /usr/mips64el-linux-gnuabi64/lib* ./ -a
$ sudo chroot ./ ./a.out
hello!

First we copy all the library files provided by the toolchain into "./", then chroot with "./" as "/", so that the dynamically linked executable "a.out" can find them and run.

I used the above method to test fftw with QEMU. I configured with --host=mips64el-linux-gnuabi64, cross-compiled fftw, and then ran make check to verify. Since the dynamically linked "tests/bench" can't be run directly, I copied the required dynamic libraries into tests:

$ cp /usr/mips64el-linux-gnuabi64/lib* ./tests/ -a

To use chroot, I renamed "bench" to "bench.mips64el", then created the file "bench" and made it executable:

$ mv ./tests/bench ./tests/bench.mips64el
$ touch ./tests/bench && chmod a+x ./tests/bench

Write the following script in the "bench" file we just created:

#!/bin/bash
# Forward all arguments unchanged; "$@" keeps arguments containing spaces intact.
sudo QEMU_CPU=Loongson-3A4000 chroot ./ ./bench.mips64el "$@"

QEMU_CPU specifies the CPU model for QEMU. In this example we are testing MSA on the mips64el architecture, so we choose the Loongson-3A4000 CPU. With that, everything is in place: run make check and the tests will run.
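
The rename-and-wrap steps above can be collected into one function. This is a sketch rather than part of the PR: wrap_for_qemu is a hypothetical name, the .mips64el suffix follows the naming used above, and the generated wrapper quotes its arguments with "$@" so that arguments containing spaces survive.

```shell
# wrap_for_qemu DIR NAME CPU
# Hypothetical helper: renames DIR/NAME to DIR/NAME.mips64el and writes an
# executable wrapper DIR/NAME that runs it in a chroot under the given CPU.
wrap_for_qemu() {
    dir=$1 name=$2 cpu=$3
    mv "$dir/$name" "$dir/$name.mips64el" || return 1
    cat > "$dir/$name" <<EOF
#!/bin/bash
# Generated wrapper: run the real binary under QEMU user emulation in a chroot.
sudo QEMU_CPU=$cpu chroot ./ ./$name.mips64el "\$@"
EOF
    chmod a+x "$dir/$name"
}
```

For the setup described here the call would be wrap_for_qemu ./tests bench Loongson-3A4000.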

If you need a more sophisticated testing environment, you can use a full Linux rootfs. On Debian, first install debootstrap:

apt install debootstrap

Debootstrap can install a rootfs for a specified architecture, such as mips64el (don't forget to install qemu-user-static):

debootstrap --arch=mips64el sid mips64el_root http://ftp.debian.org/debian

After running the above command we have a Debian rootfs. We first mount some filesystems so that the host's resources can be used properly inside the chroot environment (optional, but I suggest you do):

$ sudo mount --rbind /proc mips64el_root/proc
$ sudo mount --rbind /dev mips64el_root/dev
$ sudo mount --rbind /sys mips64el_root/sys
$ sudo mount --rbind /tmp mips64el_root/tmp

Then we run chroot. The arch command shows that we are now running on the mips64 architecture:

$ sudo chroot mips64el_root
# arch
mips64

Inside the chroot you can work as if on the native platform. For example, use apt to install software:

# apt install gcc -y
...

# gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/mips64el-linux-gnuabi64/12/lto-wrapper
Target: mips64el-linux-gnuabi64
Configured with: ../src/configure...
...
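
When the chroot is no longer needed, the bind mounts made earlier have to be undone. The sketch below only prints the commands so they can be reviewed first; the function names are hypothetical, and the extra mount --make-rslave step is an assumption on my part (it is commonly suggested so that a recursive unmount inside the rootfs does not propagate back to the host's own mounts).

```shell
# chroot_mount_cmds ROOT
# Hypothetical helper: prints the bind-mount commands for a rootfs, plus a
# --make-rslave step per mount point (an assumption, see the text above).
chroot_mount_cmds() {
    for d in proc dev sys tmp; do
        echo "mount --rbind /$d $1/$d"
        echo "mount --make-rslave $1/$d"
    done
}

# chroot_umount_cmds ROOT
# Prints the matching recursive unmount commands in reverse order.
chroot_umount_cmds() {
    for d in tmp sys dev proc; do
        echo "umount -R $1/$d"
    done
}
```

After leaving the chroot, chroot_umount_cmds mips64el_root | sudo sh -x would run the unmounts and echo each one.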

In my opinion, using binfmt_misc + QEMU and chrooting into the rootfs of another architecture is the easiest way to set up a cross-architecture environment. The chroot environment and the host share the same kernel. If you use qemu-system, you may also need to prepare a bootable kernel.

@rdolbeau
Contributor

rdolbeau commented Mar 6, 2023

@i-evi I'm of a different opinion on Qemu - I don't like modifying/cluttering my host for transient efforts, and I've found that full-system emulation is usually the easiest (and closest to real hardware configuration-wise). I do it routinely for Aarch64, PowerPC, RISC-V, SPARCv8 and 68k (and for those last two Qemu is even faster than vintage systems ;-) ), and it only requires a zvolume (ZFS) that can be easily snapshot'ed / backup'ed / migrated between hosts. I guess it all boils down to personal preferences :-)

I'll have to see if I can coerce Qemu in doing what I want...

@i-evi
Author

i-evi commented Mar 7, 2023

@rdolbeau I agree with you. On aarch64, riscv and similar architectures, full system emulation is the most convincing and easiest way. But for mips the situation is a little different: QEMU's current support for mips is relatively poor. For 32-bit mips, qemu-system-mipsel -M help shows only malta and mipssim. The situation for 64-bit mips is not much better. If you want to run Debian with qemu-system-mips(64)el, you can follow the steps below.

First download the installer image from https://ftp.debian.org/debian/dists/stable/main/installer-mips64el/current/images/ (I use loongson-3 as an example). Create a virtual disk image disk.img with qemu-img, and then run the following script:

qemu-system-mips64el \
        -M loongson3-virt \
        -m 2G \
        -kernel vmlinuz-5.10.0-20-loongson-3 \
        -initrd initrd.gz \
        -append "console=ttyS0,115200 loglevel=4" \
        -drive file=disk.img,format=raw,if=virtio \
        -serial mon:stdio -nographic \
        -device virtio-net-pci,netdev=net0 \
        -netdev user,id=net0,net=10.0.2.0/24,host=10.0.2.1,restrict=false

You will see the virtual machine boot, then go to the Debian installer and follow the normal installation steps to install Debian on your virtual disk.
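
The creation of disk.img is not shown above. A minimal sketch, assuming a raw image (matching format=raw in the command line) and a default size of 8G, which is my own placeholder rather than a value from this thread; create_raw_disk and the QEMU_IMG override are hypothetical names:

```shell
# Allow the qemu-img binary to be overridden; defaults to "qemu-img".
: "${QEMU_IMG:=qemu-img}"

# create_raw_disk FILE [SIZE]
# Hypothetical helper: creates a raw disk image (SIZE defaults to 8G,
# an assumed value) using `qemu-img create -f raw`.
create_raw_disk() {
    "$QEMU_IMG" create -f raw "$1" "${2:-8G}"
}
```

For example: create_raw_disk disk.img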

Since the installer has no way to install a bootloader, exit QEMU once the installation completes, then copy the newly installed initrd and kernel out of the virtual disk image. Take my disk.img as an example (the virtual disk image I use is in raw format):

$ mkdir root
$ fdisk -l disk.img
...
Device     Boot    Start      End  Sectors  Size Id Type
disk.img1           2048 14776319 14774272    7G 83 Linux
...
$ sudo mount -o loop,offset=$((2048*512)) disk.img root
$ cp root/boot/initrd.img root/boot/vmlinux ./
$ sudo umount root
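
The offset passed to mount is the partition's start sector multiplied by the sector size. A small sketch that derives it from the fdisk listing instead of typing it by hand; partition_offset is a hypothetical helper, and the 512-byte default sector size is an assumption that should be checked against the "Units" line fdisk prints:

```shell
# partition_offset PARTITION [SECTOR_SIZE]
# Hypothetical helper: reads `fdisk -l IMAGE` output on stdin and prints the
# byte offset of PARTITION (e.g. disk.img1). SECTOR_SIZE defaults to 512.
partition_offset() {
    awk -v part="$1" -v ss="${2:-512}" '$1 == part {
        start = ($2 == "*") ? $3 : $2   # skip the Boot flag column if set
        print start * ss
    }'
}
```

It could replace the hard-coded arithmetic like this: sudo mount -o loop,offset="$(fdisk -l disk.img | partition_offset disk.img1)" disk.img root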

Then boot with the new initrd and kernel, adding the root device to the kernel parameters (I installed on /dev/vda1):

qemu-system-mips64el \
	-M loongson3-virt \
	-m 2G \
	-initrd initrd.img \
	-kernel vmlinux \
	-append "console=ttyS0,115200 root=/dev/vda1 rw loglevel=4" \
	-drive file=disk.img,format=raw,if=virtio \
	-serial mon:stdio -nographic \
	-device virtio-net-pci,netdev=net0 \
	-netdev user,id=net0,net=10.0.2.0/24,host=10.0.2.1,restrict=false

You will then be able to boot into the Debian system just installed on the virtual disk. Using the same approach, you can also try installing Debian mipsel on the malta platform with qemu-system-mipsel.

I really don't recommend this though, because there are a lot of problems with SMP and memory support: you may only be able to use 1 CPU and no more than 2GB of memory. That is really annoying for compiling and testing even slightly complex software, because so few computing resources are available.

Also, even QEMU 7.2.0 is known to have some issues with MSA support for 32-bit mips. Among the 64-bit mips CPUs, the ones that correctly support MSA instructions are Loongson-3A4000, I6400 and I6500. I6400 and I6500 are mips r6 processors, which do not seem to be compatible with existing mips64el software.

For qemu-system-mips64el, the only available platforms are loongson3-virt and boston. The bad news is that loongson3-virt can only use the Loongson-3A1000 CPU, which does not support MSA. boston does support the mips r6 I6400 and I6500 CPUs, but there is currently no mainstream Linux distro that officially supports mips r6.

Full system emulation of mips with QEMU would certainly be a good approach. However, as things stand it seems difficult to test advanced features such as MSA without spending a lot of time on QEMU and kernel issues. For now, qemu + chroot is a reasonable compromise.
